Exploiting Chinese character models to improve speech recognition performance

Authors

James Hieronymus

Xunying Liu

Mark J. F. Gales

Philip C. Woodland

Abstract

The Chinese language is based on characters, which are syllabic in nature. Since languages have syllabotactic rules that govern the construction of syllables and their allowed sequences, Chinese character sequence models can be used as a first-level approximation of allowed syllable sequences. N-gram character sequence models were trained on 4.3 billion characters. Characters are used as a first-level recognition unit, with multiple pronunciations per character. For comparison, the CUHTK Mandarin word-based system was used to recognize words, which were then converted to character sequences. The character-only system's error rates for one-best recognition were slightly worse than those of word-based character recognition. However, combining the two systems using log-linear combination gives better results than either system separately. An equally weighted combination gave consistent character error rate (CER) gains of 0.1-0.2% absolute over the standard word-based system.
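As a rough illustration of the equally weighted log-linear combination mentioned in the abstract, the following minimal sketch combines per-hypothesis log scores from a word-based system and a character-based system. The abstract does not say at what level the combination is applied (N-best lists, lattices, or something else), so the N-best-style dictionaries, the function names `log_linear_combine` and `best_hypothesis`, the `floor` score for missing hypotheses, and all numeric scores here are assumptions made purely for illustration.

```python
def log_linear_combine(word_scores, char_scores, w_word=0.5, w_char=0.5, floor=-100.0):
    """Log-linear combination of two systems' log scores per hypothesis.

    word_scores, char_scores: dicts mapping a character-sequence hypothesis
    to that system's log score (hypothetical inputs, not the paper's format).
    floor: assumed log score for a hypothesis that one system did not produce.
    """
    combined = {}
    for hyp in set(word_scores) | set(char_scores):
        # Weighted sum of log scores, i.e. a weighted product of probabilities.
        combined[hyp] = (w_word * word_scores.get(hyp, floor)
                         + w_char * char_scores.get(hyp, floor))
    return combined


def best_hypothesis(combined):
    """Return the hypothesis with the highest combined log score."""
    return max(combined, key=combined.get)


# Made-up scores: the word-based system slightly prefers the second
# hypothesis, the character-based system strongly prefers the first.
word_scores = {"今天天气": -10.0, "今天天汽": -9.5}
char_scores = {"今天天气": -8.0, "今天天汽": -11.0}

combined = log_linear_combine(word_scores, char_scores)
print(best_hypothesis(combined))
```

With equal weights of 0.5, the combined score is the average of the two log scores, so a hypothesis that both systems rate reasonably well can outrank one that only a single system prefers.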

Similar resources

Syllable language models for Mandarin speech recognition: exploiting character language models.

Mandarin Chinese is based on characters which are syllabic in nature and morphological in meaning. All spoken languages have syllabotactic rules which govern the construction of syllables and their allowed sequences. These constraints are not as restrictive as those learned from word sequences, but they can provide additional useful linguistic information. Hence, it is possible to improve spee...

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

Modeling Pronunciation Variation for Bi-Lingual Mandarin/Taiwanese Speech Recognition

In this paper, a bi-lingual large vocabulary speech recognition experiment based on the idea of modeling pronunciation variations is described. The two languages under study are Mandarin Chinese and Taiwanese (Min-nan). These two languages are basically mutually unintelligible, and they have many words with the same Chinese characters and the same meanings, although they are pronounced differen...

A Radical Approach to Handwritten Chinese Character Recognition Using Active Handwriting Models

This paper applies active handwriting models (AHM) to handwritten Chinese character recognition. Exploiting active shape models (ASM), the AHM can capture the handwriting variation from character skeletons. The AHM has the following characteristics: principal component analysis is applied to capture variations caused by handwriting, an energy functional on the basis of chamfer distance transfor...

Publication date: 2009
